[inference_metadata_fields] Clear inference results on explicit nulls #119145

Mikep86 · 2024-12-19T21:12:30Z

Fix a bug where explicitly setting a `semantic_text` source field to null in an update request did not clear the inference results for that field. This bug only affects the new `_inference_fields` format.

Previously, setting a field explicitly to null in an update request did not work correctly with semantic text fields. This change resolves the issue by adding an explicit null entry to the `_inference_fields` metadata when such cases occur. The explicit null value ensures that any prior inference results are overwritten during the merge of the partial update with the latest document version.
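To illustrate why the explicit null entry matters, here is a minimal sketch of the merge behaviour described above. It is not the Elasticsearch implementation: the class name, the `merge` helper, and the shallow `putAll` merge are assumptions made for illustration, and the field names are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models _inference_fields metadata as a flat map from
// field name to inference results. Not the actual Elasticsearch merge code.
final class InferenceFieldsMergeSketch {

    /**
     * Shallow merge of a partial update into the stored metadata.
     * Every key present in the update replaces the stored entry, including
     * keys mapped to null. Keys absent from the update are left untouched.
     */
    static Map<String, Object> merge(Map<String, Object> stored, Map<String, Object> partialUpdate) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(partialUpdate);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("sparse_field", "results from the previous value");

        // No entry for the nulled field: the merge keeps the stale results.
        Map<String, Object> withoutEntry = new LinkedHashMap<>();
        System.out.println(merge(stored, withoutEntry));

        // Explicit null entry: the merge overwrites the stale results.
        Map<String, Object> withExplicitNull = new LinkedHashMap<>();
        withExplicitNull.put("sparse_field", null);
        System.out.println(merge(stored, withExplicitNull));
    }
}
```

In this simplified model, an update that carries no entry for the field cannot be told apart from an update that never touched it, which is why the stale results survive the merge.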

...ava/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilterTests.java

Mikep86 · 2024-12-19T21:16:51Z

...rc/yamlRestTest/resources/rest-api-spec/test/inference/60_semantic_text_inference_update.yml

I refactored into two YAML tests so that we can focus on the explicit nulls in one of them

kderusso

Nice work! Overall looks good.

kderusso · 2024-12-19T21:27:37Z

...ain/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilter.java

                // ensure that the order in the original field is consistent in case of multiple inputs
                Collections.sort(responses, Comparator.comparingInt(FieldInferenceResponse::inputOrder));
                Map<String, List<SemanticTextField.Chunk>> chunkMap = new LinkedHashMap<>();
                for (var resp : responses) {
+                    // Get the first non-null model from the response list
+                    if (model == null) {


Do we need any additional validation here to verify that `model`, if it already exists, is compatible with `resp.model`?

kderusso · 2024-12-19T21:32:37Z

...ain/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilter.java

+                        var valueObj = XContentMapValues.extractValue(sourceField, docMap, EXPLICIT_NULL);
+                        if (useLegacyFormat == false && isUpdateRequest && valueObj == EXPLICIT_NULL) {
+                            /**
+                             * It's an update request, and the source field is explicitly set to null,


Nice comment here!
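For readers following the hunk above, the sketch below shows the general sentinel technique for telling an absent field apart from one explicitly set to null. It is an assumption-laden illustration only: `ExplicitNullSketch` and its `extract` method are hypothetical, handle flat keys only, and do not reproduce `XContentMapValues.extractValue`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the sentinel technique; names are illustrative only.
final class ExplicitNullSketch {

    /** Sentinel returned when a key is present but mapped to null. */
    static final Object EXPLICIT_NULL = new Object();

    /**
     * Returns the value for the key, EXPLICIT_NULL if the key is present
     * with a null value, or null if the key is absent from the map.
     */
    static Object extract(String key, Map<String, Object> doc) {
        if (doc.containsKey(key) == false) {
            return null;
        }
        Object value = doc.get(key);
        return value == null ? EXPLICIT_NULL : value;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("sparse_field", null);

        // Identity comparison against the sentinel, as in the condition in the hunk.
        System.out.println(extract("sparse_field", doc) == EXPLICIT_NULL);
        System.out.println(extract("other_field", doc) == EXPLICIT_NULL);
    }
}
```

Comparing by identity against a dedicated sentinel object avoids confusing an explicit null with any legitimate field value.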

kderusso · 2024-12-19T21:35:30Z

...ain/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilter.java

+                             */
+                            var slot = ensureResponseAccumulatorSlot(itemIndex);
+                            slot.addOrUpdateResponse(
+                                new FieldInferenceResponse(field, sourceField, null, order++, 0, null, EMPTY_CHUNKED_INFERENCE)


We're incrementing `order` here, and the existing code still increments it later, on line 566. Do we need to reset the value of `order` before we iterate through the values here? It's a bit confusing on a read-through.

kderusso · 2024-12-19T21:49:21Z

...rc/yamlRestTest/resources/rest-api-spec/test/inference/60_semantic_text_inference_update.yml

+  - match: { hits.total.value: 1 }
+  - match: { hits.total.relation: eq }
+
+  - length: { hits.hits.0._source._inference_fields.sparse_field.inference.chunks: 1 }


I know we check the offsets here, but can we check the embeddings as well, to verify more visibly that the right chunk was removed?

elasticsearchmachine · 2024-12-23T21:56:49Z

Pinging @elastic/search-eng (Team:SearchOrg)

elasticsearchmachine · 2024-12-23T21:56:50Z

Pinging @elastic/search-relevance (Team:Search - Relevance)

jimczi and others added 7 commits December 18, 2024 12:46

improve test

60a4a1d

Create a field inference response with empty chunks

af5a348

Allow model settings to be null

8d0c146

Merge branch 'inference_metadata_fields' into inference_metadata_fields_explicit_null_fixes

67fb4e1

Refactor "Bypass inference on bulk update operation" into two tests

e4fe494

Update "Explicit nulls clear inference results on bulk update operation" test to use multiple source fields

cc1c5de

Mikep86 added the >non-issue, :SearchOrg/Relevance (label for the Search (solution/org) Relevance team), and v9.0.0 labels Dec 19, 2024

Mikep86 requested review from carlosdelest, jimczi and kderusso December 19, 2024 21:12

Mikep86 mentioned this pull request Dec 19, 2024

[inference_metadata_fields] Fix handling of explicit null values for semantic text fields #118947

Closed

Mikep86 added 3 commits December 23, 2024 13:48

Merge branch 'inference_metadata_fields' into inference_metadata_fields_explicit_null_fixes

6677e8f

Fix compilation errors

e69f15a

Fix unit test

5cee11b

Mikep86 marked this pull request as ready for review December 23, 2024 21:56
